Providing efficient, extensible and adaptive intra-query parallelism for advanced applications
Author
Abstract
Parallel execution offers a solution to the problem of reducing the response time of object-relational queries against large databases. A database management system answers a query by first finding a procedural plan to execute the query and subsequently executing the plan to produce the query result. In this thesis we address all significant levels of the query processing architecture in order to provide a comprehensive approach to the problem of efficient intra-query parallelism. To this end, we develop optimization and parallelization algorithms using models that incorporate the sources of parallelism as well as the obstacles to achieving speedup. To reduce its inherent complexity, we have split parallelization into several phases, each concentrating on particular aspects of parallel query execution. This rule- and cost-based approach guarantees both extensibility and effectiveness. Adaptability to diverse application domains and architectural characteristics is provided by means of appropriate parameter settings. The proposed strategies have been implemented and evaluated within the parallel object-relational DBMS prototype MIDAS. The results show that the presented approach is particularly suitable for the parallelization of large and complex queries, as can be found in upcoming applications such as data warehouses, digital libraries or stream analysis.
Similar resources
Volcano - An Extensible and Parallel Query Evaluation System
To investigate the interactions of extensibility and parallelism in database query processing, we have developed a new dataflow query execution system called Volcano. The Volcano effort provides a rich environment for research and education in database systems design, heuristics for query optimization, parallel query execution, and resource allocation. Volcano uses a standard interface between ...
Non-zero probability of nearest neighbor searching
Nearest Neighbor (NN) searching is a challenging problem in data management and has been widely studied in data mining, pattern recognition and computational geometry. The goal of NN searching is to efficiently report the data nearest to a given query object. In most studies, both the data and the query are assumed to be precise; however, due to the real applications of NN searching, suc...
Efficient Index-based Processing of Join Queries in DHTs
Massively distributed applications require the integration of heterogeneous data from multiple sources. Peer-to-peer (P2P) is one possible network model for these distributed applications and among P2P architectures, distributed hash table (DHT) is well known for its routing performance guarantees. Under a general distributed relational data model, join query operator, an essential component to...
Apuama: Combining Intra-query and Inter-query Parallelism in a Database Cluster
Database clusters provide a cost-effective solution for high performance query processing. By using either inter- or intra-query parallelism on replicated data, they can accelerate individual queries and increase throughput. However, there is no database cluster that combines inter- and intra-query parallelism while supporting intensive update transactions. C-JDBC is a successful database cluster t...
Parallel Query Processing
With relations growing larger and queries becoming more complex, parallel query processing is an increasingly attractive option for improving the performance of database systems. The objective of this paper is to examine the various issues encountered in parallel query processing and the techniques available for addressing these issues. The focus of the paper is on the join operation with both ...